Improving counting Bloom filter performance with fingerprints
Authors
Abstract
Bloom filters (BFs) are used in many applications for approximate set-membership checks. Counting Bloom filters (CBFs) extend BFs to support the deletion of entries at the cost of additional storage. Several alternatives to CBFs have been proposed to reduce this storage overhead, for example schemes based on d-left hashing or cuckoo hashing. More recently, a new type of CBF, the Variable Increment Counting Bloom Filter (VI-CBF), has been introduced to improve performance. The VI-CBF uses different increments in the filter counters to reduce the false positive rate and therefore the storage requirements. This paper presents another mechanism to improve CBF performance: the Fingerprint Counting Bloom Filter (FP-CBF). The proposed scheme uses fingerprints in the filter entries to reduce the false positive rate. This results in a simpler implementation than VI-CBFs in terms of the number of hash functions and arithmetic operations. The false positive rate of the proposed scheme has been analyzed theoretically and by simulation and compared with that of the VI-CBF. The results show that the proposed scheme can achieve lower false positive rates than a simple VI-CBF implementation. Compared with a better and more complex VI-CBF implementation, the FP-CBF performs better when the number of bits per element is large, while the VI-CBF is better when the number of bits per element is small.
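The abstract does not spell out how fingerprints are stored in the filter entries. The following is a minimal sketch under stated assumptions: each cell keeps a counter plus the XOR of the fingerprints of all elements mapped to it, and a lookup returns a negative when a cell with a count of exactly one holds a fingerprint different from that of the queried element. The class name `FingerprintCBF`, its parameters, and the SHA-256 based hashing are illustrative choices, not the authors' implementation.

```python
import hashlib


class FingerprintCBF:
    """Toy counting Bloom filter whose cells also hold an XOR-ed fingerprint.

    Assumption (not stated in the abstract): a cell stores a counter and the
    XOR of the fingerprints of every element currently mapped to it.
    """

    def __init__(self, num_cells=1024, num_hashes=3, fp_bits=8):
        self.m = num_cells
        self.k = num_hashes
        self.fp_mask = (1 << fp_bits) - 1
        self.counters = [0] * num_cells
        self.fingerprints = [0] * num_cells

    def _digest(self, item, salt):
        # Hypothetical hash family: SHA-256 of "salt:item", truncated to 64 bits.
        data = f"{salt}:{item}".encode("utf-8")
        return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")

    def _cells(self, item):
        return [self._digest(item, i) % self.m for i in range(self.k)]

    def _fingerprint(self, item):
        return self._digest(item, "fp") & self.fp_mask

    def add(self, item):
        fp = self._fingerprint(item)
        for c in self._cells(item):
            self.counters[c] += 1
            self.fingerprints[c] ^= fp

    def remove(self, item):
        # Only valid for items that were previously added, as in any CBF.
        fp = self._fingerprint(item)
        for c in self._cells(item):
            self.counters[c] -= 1
            self.fingerprints[c] ^= fp  # XOR is its own inverse

    def __contains__(self, item):
        fp = self._fingerprint(item)
        for c in self._cells(item):
            if self.counters[c] == 0:
                return False  # ordinary CBF negative
            if self.counters[c] == 1 and self.fingerprints[c] != fp:
                return False  # single occupant with a different fingerprint
        return True


words = ["alpha", "beta", "gamma"]
cbf = FingerprintCBF()
for w in words:
    cbf.add(w)
assert all(w in cbf for w in words)  # stored items are never rejected
for w in words:
    cbf.remove(w)
assert "alpha" not in cbf  # the filter is empty again
```

In this sketch the extra check only applies to cells with a count of one, because only then does the stored XOR equal a single element's fingerprint. Cells with higher counts behave as in a plain CBF.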
Related articles
A Cuckoo Filter Modification Inspired by Bloom Filter
Probabilistic data structures are widely used in membership queries, network applications, and similar settings. Bloom filters and cuckoo filters are two popular space-efficient structures used for set-membership checking in many important protocols. They are compact representations of data that use hash functions to randomize a set of items. Being able to store more elements while keeping a reaso...
An Efficient Data Fingerprint Query Algorithm Based on Two-Leveled Bloom Filter
The function of the comparing fingerprints algorithm was to judge whether a new partitioned data chunk was in a storage system a decade ago. At present, in the most de-duplication backup system the fingerprints of the big data chunks are huge and cannot be stored in the memory completely. The performance of the system is unavoidably retarded by data chunks accessing the storage system at the qu...
Bloom Filters via d-Left Hashing and Dynamic Bit Reassignment Extended
In recent work, the authors introduced a data structure with the same functionality as a counting Bloom filter (CBF) based on fingerprints and the d-left hashing technique. This paper describes dynamic bit reassignment, an approach that allows the size of the fingerprint to flexibly change with the load in each hash bucket, thereby reducing the probability of a false positive. This technique al...
Autoscaling Bloom Filter: Controlling Trade-off Between True and False Positives
A Bloom filter is a simple data structure supporting membership queries on a set. The standard Bloom filter does not support the delete operation, therefore, many applications use a counting Bloom filter allowing the deletion. This paper proposes a generalization of the counting Bloom filters approach, called “autoscaling Bloom filters”, which allows elastic adjustment of its capacity with prob...
Robust Detection and Tracking of Long-range Target in a Compound Framework Kang Sun and Xinwei Li A Study on the Using Behavior of Depot-Logistic Information System in Taiwan: An Integration of Satisfaction Theory and Technology Acceptance Theory
Journal title:
- Inf. Process. Lett.
Volume 116
Publication date: 2016